feat(gepa): add tool description optimization for multi-agent systems #8928
Conversation
- Add `optimize_tool_descriptions` parameter (default False) to GEPA
- Extract tool descriptions from all nested modules via `named_sub_modules()`
- Apply optimized descriptions in `DspyAdapter.build_program()`
- Enables holistic optimization of tools across main and subagent modules
- Tests: 4 new tests, all 16 pass (4 new + 12 existing)
Apologies for accidentally closing #8927. Thank you for the thorough review, @LakshyAAAgrawal! I'll address your feedback.
I'll start working on items 1 and 2 and update the PR soon. Please let me know if you have any specific preferences for the tutorial format!
Thanks a lot! For the tutorial, I think you can follow the current GEPA tutorial format (load a dataset, show an example from the dataset, build a DSPy program, evaluate the baseline program on the test set, run GEPA with the new optimization settings, show the optimized program's prompts and tool descriptions, and finally evaluate the optimized program). Hopefully we should see a nice, large gain on agentic tasks with this amazing contribution from you!
- Add ToolProposer with GenerateImprovedToolDescription signature
- Implement routing logic to separate tools from signatures
- Tools use ToolProposer; signatures use custom or parent default
- Backward compatible: preserves existing custom_instruction_proposer behavior
- Add test verifying routing splits components correctly

- Define tool functions outside class for clarity
- Match structure of simple ReAct example
- Add clear comments explaining architecture
- Make code more readable and maintainable
Force-pushed 197f077 to c4f2041 (compare)
Hi @LakshyAAAgrawal, I've implemented the tool-specific proposer as requested! Here's what's included:
1. Tool-Specific Proposer Implementation ✅
2. Documentation ✅
Reflection Prompt Design: before I create a short tutorial (item #3), would you have any feedback on:
Any feedback would be helpful before I invest time in the tutorial. Thank you!
Wait, there is a bug in the implementation; I'm working on a fix. The test also has to be fixed.
…euse

Tools now copy ReAct's reflective data with tool-specific annotation instead of complex trajectory extraction. This 15-line approach reuses ReAct's existing context (thoughts, tool calls, observations) and adds a focused annotation for each tool.

Implementation:
- Tools receive full ReAct reflective examples (same trajectory context)
- Feedback prefixed: [Optimizing tool: 'X'] for focused optimization
- Reflection LM sees complete multi-step execution traces per tool

Benefits:
- Simpler: 15 lines vs the 70+ line extraction approach
- Reuses code: no duplicate trajectory formatting logic
- Same context: tools see full ReAct execution traces
- Clean: removed all debug output

Tests:
- 4 focused tests following GEPA patterns (removed 1 redundant)
- 226KB fixture with 34 LM + 6 reflection calls
- All tests passing with gpt-5-nano traces

Documentation:
- Updated GEPA_Advanced.md with implementation details
- Explains reflective dataset construction approach
The `optimize_tool_descriptions` parameter enables GEPA to optimize tool descriptions in addition to signature instructions. This is particularly valuable for ReAct agents and other tool-using systems, where the quality of tool descriptions directly impacts the agent's ability to select appropriate tools for each task.
Unlike signature instructions that guide reasoning strategies, tool descriptions serve a fundamentally different purpose: they help agents decide **which tool to use** in a given situation. GEPA recognizes this categorical difference and applies a specialized reflection prompt tailored for tool selection decisions.
which tool to use, when to use it, and how to use it. All three are captured by the description.
Let's avoid the word "fundamentally". One can imagine that all tool descriptions can be (and many times are) simply included in the system prompt itself.
Please also add a corresponding entry in GEPA Overview, that links to this file/section.
Consider enabling `optimize_tool_descriptions=True` when:
- **Building ReAct agents**: ReAct agents rely on tool descriptions to make action selection decisions
One should consider using this when they use dspy.Tool anywhere in the DSPy program. Here are a few scenarios for using dspy.Tool:
**Note:** Tool optimization is fully backward compatible. Existing programs without tools, or with `optimize_tool_descriptions=False`, continue to work exactly as before.
I don't think we need to inform users about backward compatibility here. It should be implicit that there should be no behaviour changes for any program not containing dspy.Tool.
dspy/teleprompt/gepa/gepa.py
Outdated
raised if a mismatch in module-level and predictor-level score is detected.
optimize_tool_descriptions: Whether to optimize tool descriptions for modules with tools
    (e.g., ReAct agents). When enabled, tool descriptions are included in the optimization
    process alongside signature instructions. Default is False.
Add a link to GEPA Advanced/Tool section
dspy/teleprompt/gepa/gepa_utils.py
Outdated
self.propose_new_texts = custom_propose_new_texts
elif self.optimize_tool_descriptions:
Edge case: what should happen when the user provides both a custom proposer and enables `optimize_tool_descriptions`?
dspy/teleprompt/gepa/gepa_utils.py
Outdated
# Handle signature components - replicate proposer's default behavior
sig_texts = {}
if sig_components:
    from gepa.strategies.instruction_proposal import InstructionProposalSignature
This is a slight deviation from this PR, but would be a large enhancement (feel free to ignore):
- Create 2 fields, self.instruction_proposal_signature and self.tool_proposer, which are initialized to the default InstructionProposalSignature and ToolProposerSignature.
- Take an argument from dspy.GEPA that can override the default signature values.
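The suggested enhancement might look like the following sketch (class and field names are illustrative stand-ins, not the actual GEPA internals; strings stand in for the real signature classes):

```python
class DspyAdapterSketch:
    """Sketch: default proposal signatures stored as overridable fields."""

    DEFAULT_INSTRUCTION_SIGNATURE = "InstructionProposalSignature"
    DEFAULT_TOOL_SIGNATURE = "ToolProposalSignature"

    def __init__(self, instruction_proposal_signature=None,
                 tool_proposal_signature=None):
        # Fall back to the defaults unless the caller supplies overrides,
        # e.g. via a new dspy.GEPA argument.
        self.instruction_proposal_signature = (
            instruction_proposal_signature or self.DEFAULT_INSTRUCTION_SIGNATURE
        )
        self.tool_proposal_signature = (
            tool_proposal_signature or self.DEFAULT_TOOL_SIGNATURE
        )
```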
dspy/teleprompt/gepa/gepa_utils.py
Outdated
# Second pass: Process tools by copying ReAct data with annotation
react_module_name = None
for name in ret_d.keys():
    if "react" in name.lower():
Is this robust? Might it be better to use isinstance or some other way?
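The fragility of substring matching can be seen with plain Python stand-ins (the classes below are dummies; in the real code the type check would be `isinstance(module, dspy.ReAct)`):

```python
class ReAct:
    """Stand-in for dspy.ReAct."""

class ReactiveCache:
    """A module whose name contains 'react' but is not a ReAct agent."""

modules = {"agent.react_loop": ReAct(), "utils.reactive_cache": ReactiveCache()}

# Fragile: substring matching also picks up 'reactive_cache'.
by_name = [n for n in modules if "react" in n.lower()]

# Robust: a type check only matches actual ReAct instances.
by_type = [n for n, m in modules.items() if isinstance(m, ReAct)]
```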
Your task is to write a better description for this tool.
Read the examples carefully and identify patterns in when the tool was used successfully versus when it was misused or overlooked. Identify any domain-specific information about the tool's capabilities or appropriate usage that may not be available to the assistant in the future. The assistant may have developed effective patterns for tool selection - if so, ensure the tool description supports those patterns.
Tool use. Also suggest identifying any failure modes of the tool?
Dear @Ju-usc, this is a great PR. Thanks a lot! I have tried to be overly critical and made too many nits; feel free to ignore anything you disagree with, and let me know if you'd like me to address anything! Regarding the meta prompt, overall I think it looks great. However, I suggest that as you build the tutorial, you may find that the reflection prompt needs tweaking, or that the content exposed in reflective_dataset for the tool may be lacking or need improvement. This is going to be an empirical exercise, which will guide what works in the reflection meta prompts. Looking forward to the tutorial on this too! You may already have thoughts about what you'd like to show in the tutorial, but if not, you may consider building off (https://kargarisaac.medium.com/building-and-optimizing-multi-agent-rag-systems-with-dspy-and-gepa-2b88b5838ce2) by @kargarisaac.
- Add GenerateImprovedToolDescriptionFromFeedback signature documentation
- Include tool-aware metric example showing trajectory access
- Document tool prefix annotation in feedback
- Note component_selector applies to both signatures and tools
- Fix 'fundamentally' language per reviewer feedback

- Separate Pass 1 (predictor examples) and Pass 2 (tool aggregation)
- Clarify Generated Outputs includes full trajectory for ReAct
- Fix feedback annotation format to [Tool 'name' from 'predictor_key']
- Add Component Identification & Proposer Routing section
- Explain dual-proposer independence (custom proposer doesn't affect tool proposer)
- Use consistent terminology: 'predictor' and 'signature instructions'
#### Implementing a Custom Proposer for ReAct
If you need custom logic, you must handle ReAct components yourself. ReAct components are stored as JSON strings containing all 4 parts:
Instead of saying this, we can say that you can start with the existing implementation at X.
Hey @Ju-usc, this looks great to me. Left a few comments, otherwise happy to merge! Can you address the ruff issues as well?
…r_base_program for clarity
…for selective optimization
…Act component optimization
Update the ReAct proposer's reflection signature to guide the LM toward more
appropriate output granularity and selective optimization.
Changes:
- Add context that components are progressively optimized across iterations
- Change 'and' to 'and/or' for abstraction/specificity (allows flexibility)
- Refine field descriptions to guide output style:
* 'ReAct instruction for reasoning and tool selection' (functional context)
* 'Extract instruction for answer extraction' (functional context)
* 'Purpose of tool' (focuses on high-level what/why, not verbose how)
* 'Usage of parameter' (focuses on specific usage, not essay)
The goal is to prevent overly verbose LM outputs (multi-paragraph tool/param
descriptions) while preserving exploration capability. Field descriptions now
provide functional context ('for reasoning', 'purpose', 'usage') that naturally
guides appropriate scope without being prescriptive about format or length.
This allows the reflection LM to determine the right level of detail based on
what's needed to fix failures, aligned with GEPA's general meta-prompt philosophy.
Replace the prescriptive 'minimize tool calls' example with an educational progression that shows users how to write effective metrics without forcing specific objectives.

Changes:
- Show a simple metric first (just correctness feedback)
- Then show a trajectory-based metric (accessing agent execution)
- Use a clear for-loop instead of a list comprehension for readability
- Follow DSPy docs conventions: answer_match variable, example/pred naming
- Remove the 'minimize tool calls' directive - let users decide their objectives
- Add bullet points explaining what the trajectory can reveal (tool selection, reasoning quality, efficiency) without prescribing how to use it
- Rename the section to 'Writing Metrics for ReAct Optimization' (more actionable)

This aligns with GEPA's philosophy: provide general, extensible patterns that users can adapt to their specific needs. Detailed examples can be shown in tutorials rather than API documentation.

Addresses PR review comment 5 about prescriptive objectives in documentation.
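The progression described in this commit might look like the following sketch (pure-Python stand-ins: dicts replace `dspy.Example`/`dspy.Prediction`, and the returned dict stands in for `dspy.Prediction(score=..., feedback=...)`; in real GEPA code `pred.trajectory` would come from `dspy.ReAct`):

```python
def simple_metric(example, pred, trace=None, pred_name=None, pred_trace=None):
    """Step 1: correctness-only feedback."""
    answer_match = example["answer"].lower() == pred["answer"].lower()
    score = 1.0 if answer_match else 0.0
    return {"score": score, "feedback": f"Answer match: {answer_match}"}

def trajectory_metric(example, pred, trace=None, pred_name=None, pred_trace=None):
    """Step 2: also summarize the agent's tool use from its trajectory."""
    answer_match = example["answer"].lower() == pred["answer"].lower()
    score = 1.0 if answer_match else 0.0
    lines = [f"Answer match: {answer_match}"]
    # The trajectory can reveal tool selection, reasoning quality, efficiency.
    for step in pred.get("trajectory", []):
        lines.append(f"Called {step['tool']} with {step['args']}")
    return {"score": score, "feedback": "\n".join(lines)}
```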
…duleProposer

Address PR review comment 6 by simplifying the custom proposer documentation.

Changes:
- Replace the long inline implementation example with a clickable GitHub link
- Point to ReActModuleProposer as the reference implementation
- Add a bulleted list of what the reference shows (parsing, dynamic signatures, etc.)
- Keep essential JSON structure and interface documentation
- Remove 100+ lines of redundant code example

Benefits:
- Less overwhelming for users (no duplicate code)
- Single source of truth (reference implementation)
- Clickable link to actual working code on GitHub
- Users can copy/modify the real implementation instead of an example

Addresses the PR comment from @LakshyAAAgrawal about using a reference instead of a full implementation example.
@LakshyAAAgrawal Thanks for the thorough review! Addressed all 6 comments:
Let me know if you have any other thoughts to move this PR forward!
Improve the custom proposer documentation to be more user-friendly while
maintaining technical accuracy.
Changes:
- Warmer, more inviting opening ("best way to start")
- Concrete example with 'search' tool instead of generic placeholders
- Plain English explanations for each component ("How the agent reasons...")
- Clear separation: "What you can improve" vs "What to preserve"
- Simpler code example with inline comments explaining ReAct vs regular
- Concise "reference shows how to" bullets (3 key points)
- More approachable tone without sacrificing precision
This makes the advanced feature more accessible to users who need custom
optimization logic beyond the defaults.
Follows up on the previous commit addressing PR comment about custom proposer example.
Force-pushed 5f3a9aa to 1b10b65 (compare)
@LakshyAAAgrawal I have also updated the PR description to reflect all the changes. Ready for re-review!
…ation

Sync documentation with the actual reflection prompt after bd4cdac:
- Add 'These components are progressively optimized' context
- Change to 'and/or specificity' for flexibility
- Update output field types to 'str | None' with default=None
- Refine field descriptions ('for reasoning and tool selection', 'for answer extraction')
- Add note about dynamic field descriptions ('Purpose of tool', 'Usage of parameter')

This ensures the docs accurately reflect the current prompt design, which guides appropriate granularity without being prescriptive.
When GEPA is invoked on a DSPy program which has a ReAct module, but does not have "optimize_react=True" set, print a warning saying "For ReAct programs, consider using
Add a warning message when GEPA detects ReAct modules in the program but optimize_react_components=False. This helps users discover the ReAct optimization feature.

Changes:
- Always traverse modules to detect ReAct instances
- If optimize_react_components=False, warn for each ReAct module found
- Show the module path to help users identify what would be optimized
- No behavioral changes when optimize_react_components=True

Addresses maintainer feedback to make the feature more discoverable.
@LakshyAAAgrawal I accidentally hit "Close" while working on the warning feature and reopened right away! Implemented the warning. Ready for re-review when you have a chance!
Pull Request Overview

This PR adds comprehensive support for optimizing ReAct modules in GEPA, addressing a critical bug where module paths were being truncated during detection and reconstruction. The implementation enables joint optimization of all ReAct components (react instructions, extract instructions, tool descriptions, and tool argument descriptions) while preserving full module paths for correct identification in nested multi-agent systems.

Key Changes
- Added `optimize_react_components` flag to enable ReAct module optimization in GEPA
- Implemented `ReActModuleProposer` for joint optimization of ReAct components using dynamic signature generation
- Fixed path truncation bug by preserving full module paths (e.g., "multi_agent.orchestrator" instead of "multi_agent")
Reviewed Changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 8 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/teleprompt/test_gepa_react_optimization.py | Comprehensive test suite validating ReAct module detection, reconstruction, and reflective dataset creation with full path preservation |
| dspy/teleprompt/gepa/instruction_proposal.py | New ReActModuleProposer class and signature for optimizing ReAct components with dynamic tool field generation |
| dspy/teleprompt/gepa/gepa_utils.py | Enhanced DspyAdapter with ReAct-aware component routing, reflective dataset creation, and program building logic |
| dspy/teleprompt/gepa/gepa.py | Added ReAct module discovery, JSON serialization of module configs, and optimize_react_components parameter |
| docs/docs/api/optimizers/GEPA/overview.md | Brief overview of ReAct component optimization feature |
| docs/docs/api/optimizers/GEPA/GEPA_Advanced.md | Comprehensive guide on ReAct optimization including usage examples and custom proposer implementation |
Comments suppressed due to low confidence (1)
dspy/teleprompt/gepa/gepa_utils.py:1
- Corrected unmatched parenthesis - should be 'original)' → 'original'.
import json
tools_list = []
for tool_name, tool_info in current_tools_dict.items():
    tool = dspy.Tool(
        func=lambda: None,  # Placeholder - Tool requires Callable, but only schema is used
Copilot AI (Nov 3, 2025)
Lambda function closure issue: all tools will share the same lambda instance. While this works because the function is never executed (only schema is used), consider using a named placeholder function or lambda *args, **kwargs: None for clarity that this is intentionally a no-op.
- func=lambda: None,  # Placeholder - Tool requires Callable, but only schema is used
+ func=lambda *args, **kwargs: None,  # Placeholder - Tool requires Callable, but only schema is used
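The difference the suggestion makes can be seen directly in plain Python (a small illustrative sketch, independent of dspy):

```python
# A zero-argument lambda breaks if the placeholder tool is ever invoked
# with arguments; a catch-all signature makes the no-op intent explicit.
def make_placeholder():
    return lambda *args, **kwargs: None

placeholder = make_placeholder()
result = placeholder("query", top_k=3)  # accepted: catch-all signature

try:
    (lambda: None)("query")  # the original placeholder rejects arguments
    zero_arg_failed = False
except TypeError:
    zero_arg_failed = True
```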
import json

import dspy
Module 'dspy' is imported with both 'import' and 'import from'.
try:
    optimizer.compile(program, trainset=trainset, valset=trainset)
except Exception:
    pass
'except' clause does nothing but pass and there is no explanatory comment.
- try:
-     optimizer.compile(program, trainset=trainset, valset=trainset)
- except Exception:
-     pass
+ optimizer.compile(program, trainset=trainset, valset=trainset)
This too?
try:
    optimizer.compile(program, trainset=trainset, valset=trainset)
except Exception:
'except' clause does nothing but pass and there is no explanatory comment.
  except Exception:
+     # Exception during compile is ignored because the test proceeds with a mock optimized candidate.
try:
    optimizer.compile(program, trainset=trainset, valset=trainset)
except Exception:
    pass
'except' clause does nothing but pass and there is no explanatory comment.
- try:
-     optimizer.compile(program, trainset=trainset, valset=trainset)
- except Exception:
-     pass
+ optimizer.compile(program, trainset=trainset, valset=trainset)
try:
    optimizer.compile(program, trainset=trainset, valset=trainset)
except Exception:
'except' clause does nothing but pass and there is no explanatory comment.
  except Exception:
+     # GEPA optimizer may raise during compilation; ignore to allow inspection of captured base program.
try:
    optimizer.compile(program, trainset=trainset, valset=trainset)
except Exception:
Do we expect any exception to actually be raised here? If yes, we should catch exactly that exception.
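The pattern the reviewer is asking for, in a small self-contained sketch (the `CompileError` class and `compile_program` function are hypothetical stand-ins for whatever specific exception GEPA's compile actually raises):

```python
class CompileError(RuntimeError):
    """Hypothetical specific error the test expects compile() to raise."""

def compile_program():
    raise CompileError("budget exhausted")

# Instead of `except Exception: pass`, catch exactly the expected error so
# genuine bugs (TypeError, KeyError, ...) still fail the test loudly.
try:
    compile_program()
except CompileError as err:
    caught = str(err)
```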
try:
    optimizer.compile(program, trainset=trainset, valset=trainset)
except Exception:
Similarly here
Hi @Ju-usc, echoing some discussion with @chenmoneygithub here. One question we have is how tied this is to ReAct modules. Say I have a custom DSPy Module that is not a dspy.ReAct but uses dspy.Tool; will these changes still be applicable? If yes, it makes sense to change
chenmoneygithub left a comment
Thanks for the PR!
At a high level I really like what is achieved by this PR; being able to optimize tool descriptions is a big gain. However, I don't think we should engineer towards ReAct, which is just one agent structure for tool calling. Bias towards ReAct makes the added code easy to go stale. Instead, I recommend targeting the optimization at tools: for any DSPy Module, self.tools are subject to optimization.
  custom_instruction_proposer=self.custom_instruction_proposer,
- warn_on_score_mismatch=self.warn_on_score_mismatch
+ warn_on_score_mismatch=self.warn_on_score_mismatch,
+ optimize_react_components=self.optimize_react_components,
Instead of engineering this towards dspy.ReAct, I recommend covering tool calling in general: optimize_react_components => optimize_tools. ReAct is just one way to build a tool-calling agent; it's quite common for users to make customizations, and we may create other tool-calling agent architectures in the near future.
- pred: The predicted output.
- trace: Optional. The trace of the program's execution.
- pred_name: Optional. The name of the target predictor currently being optimized by GEPA, for which
- pred_name: Optional. The name of the target predictor currently being optimized by GEPA, for which
Let's not mix style/lint fixes with actual code change in the same commit for clean commit history.
comp_name = pred_name if not normalized_path else f"{normalized_path}.{pred_name}"
# Use full normalized path to avoid collapsing nested modules
# e.g., "multi_agent.coordinator" not "multi_agent"
module_key = f"{REACT_MODULE_PREFIX}:{normalized_path}" if normalized_path else REACT_MODULE_PREFIX
I wonder if we need this custom name change - are you seeing duplicated keys without this code? Could you share an example? If so, that's a bug we should fix at the module level.
logger.info(f"Initialized base_program with {len(base_program)} components:")
for key in sorted(base_program.keys()):
    if key.startswith(REACT_MODULE_PREFIX):
        logger.info(f"  {key}: <ReAct module JSON config>")
I would get rid of these logs. These are internal implementation details, so users won't be able to follow them.
Generated_Outputs: dict[str, Any] | str  # Success: dict with output fields; Failure: error message string
Feedback: str  # Always a string - from metric function or parsing error message

Inputs: dict[str, Any]  # Predictor inputs (may include str, dspy.Image, etc.)
nit: separate style change to a separate PR.
def custom_propose_new_texts(
self.optimize_react_components = optimize_react_components

def build_propose_new_texts():
This method is huge now; let's move the helper function out as a private method.
# Init ReAct module proposer if tool optimization is enabled
react_module_proposer = None
if self.optimize_react_components:
    from .instruction_proposal import ReActModuleProposer
nit: always use absolute imports
if self.reflection_lm is not None:
    with dspy.context(lm=self.reflection_lm):
        return self.custom_instruction_proposer(
results.update(
you can merge the if and else branch by something like:
with dspy.context(lm=self.reflection_lm or dspy.settings.lm):
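The effect of the fallback pattern can be sketched with a generic context manager standing in for `dspy.context` (the names here are illustrative, not dspy's actual implementation):

```python
from contextlib import contextmanager

current_lm = "default-lm"  # stand-in for the ambient dspy.settings.lm

@contextmanager
def lm_context(lm):
    """Stand-in for dspy.context(lm=...): swap the active LM, then restore."""
    global current_lm
    previous, current_lm = current_lm, lm
    try:
        yield
    finally:
        current_lm = previous

def propose(reflection_lm=None):
    # Single code path: fall back to the ambient LM when no reflection LM is set,
    # instead of duplicating the call in separate if/else branches.
    with lm_context(lm=reflection_lm or current_lm):
        return current_lm
```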
@LakshyAAAgrawal @chenmoneygithub Thank you so much for the awesome detailed feedback! I'll address all the comments soon; I'd love to polish this in the right direction. Before I refactor the architecture, I have a quick clarification on custom tool usage: for custom modules using dspy.Tool (not dspy.ReAct), is it safe to assume they use dspy.ToolCalls as the predictor output to handle tool calls? This would help determine a clean approach for generalizing the optimization, especially for constructing shared tracing for the reflective dataset to jointly optimize the instructions of the tool-using predictor and the tool descriptions.
@Ju-usc Sorry about the late reply!
Unfortunately we cannot make this assumption. Help me understand - why do we need to care about the output type? I thought we only need to expose the Tool as a GEPA candidate and have GEPA propose new rules for the Tool based on the feedback.
Summary

Addresses #8706, which requested GEPA to optimize tool descriptions. This PR expands on that to enable comprehensive ReAct module optimization with joint optimization of all four ReAct components: react instructions, extract instructions, tool descriptions, and tool argument descriptions.

When `optimize_react_components=True`, GEPA discovers all `dspy.ReAct` modules in your program (including nested multi-agent systems) and uses a specialized reflection prompt to jointly optimize how agents reason, select tools, and extract answers from execution trajectories. All ReAct components are optimized together based on shared execution traces, enabling the reflection LM to generate cohesive instructions since it sees how components work together (not optimized in isolation). This addresses the ReAct trajectory prefix duplication issue (gepa-ai/gepa#97).

Fully backward compatible - the default `optimize_react_components=False` preserves existing behavior.

Issue

Closes #8706 - the original request was to enable GEPA to optimize tool descriptions. This PR expands on that to optimize all four ReAct components jointly (react instructions, extract instructions, tool descriptions, and tool argument descriptions) for more effective agent optimization.
Changes

Core Implementation
- `optimize_react_components` parameter added to GEPA (default `False` for backward compatibility)
- Treats `dspy.ReAct` as one module with react/extract/tools as subcomponents, respecting both GEPA's module-level abstraction and DSPy's ReAct module design
- `ReActModuleProposer` with dynamic signatures - a specialized proposer that generates output fields for each tool/parameter, enabling selective optimization
- Uses `named_sub_modules()` to find all `dspy.ReAct` instances (supports deeply nested multi-agent architectures)
- Routes ReAct modules to `ReActModuleProposer`, regular predictors to default/custom proposers

Testing

Documentation
- `GEPA_Advanced.md` - complete ReAct optimization guide
- `overview.md` - brief introduction linking to the advanced guide

Usage Example
Basic ReAct Agent
Multi-Agent System
Key Features

Joint Optimization:

Selective Optimization:
- The proposer may return `None` for components that should stay unchanged

Multi-Agent Support: